(57) Abstract: This invention describes an audio-visual method and system for early drowning detection. In this 
invention, a number of cameras (100, 200, 201) are mounted on top of a swimming pool (101). These cameras (100, 200, 201) 
are used to monitor swimmers in the pool (101) together with the aid of an array of microphones (103). Similarly, the microphone 
array (103) is mounted above the water to cover the entire swimming pool (101). Based on the motion and activity of the swimmer 
detected through the video camera (100, 200, 201), the swimmer's condition is automatically analyzed. Such automated video 
analysis includes building the visual background model of the pool (101), detecting the presence of the swimmers in the monitored 
areas (106), estimating the number of swimmers inside the monitored area (106), tracking the swimmers and analyzing the behaviour 
of each tracked swimmer in terms of body orientation, moving direction and motion patterns. In addition, the microphone array 
(103) is deployed to pick up audio signals (104) originating from distress calls. Once the system detects the presence of a potential 
drowning, both visual and audio alarms will be activated to draw the attention of the person in charge for further confirmation and, if 
necessary, follow-up rescue actions. 
Drowning Early Warning System 

Background of the invention 

The present invention relates to an audio-visual system
capable of monitoring a swimming pool and an automated method
of analyzing swimmers' conditions to detect potential
drowning incidents based on the received audio and video
signals. A number of video cameras and an array of
microphones are strategically placed above the water around
the pool such that the entire swimming pool can be covered.
Through the processing of the video sequence and/or the audio
signal, abnormal conditions will be detected and considered
as potential drowning cases. Such a system serves as an aid
to the lifeguards on duty or as a distress call to alert
nearby people.

At present, there does not exist a viable automated system
that monitors a swimming pool for a person in distress by
analyzing swimmers' behavior. Yet there are many
swimming pools worldwide, located in public places, private
houses, condominiums and hotels. Unlike public pools, most
private pools do not have lifeguards on duty. Therefore, a
system that can provide monitoring assistance to such pools
would be useful. In addition, a system capable of alerting
the lifeguards on duty to potential distress or drowning
incidents would be helpful. Timely rescue of the person in
distress or the drowning victim is critical in saving lives
and in preventing irreversible injuries. It is thus highly
desirable to have an automatic system capable of detecting
distress or potential drowning accidents at an early stage.



Related art 

A few prior-art documents describe the monitoring of
swimming pools to detect drowning cases. Most of them
describe the use of floating devices that will sound an alarm
if there is a wave (US Patent No 3953843, 4510487, 4775854),
an echo (US Patent No 5274607, 5369623) or when the water is
disturbed (US Patent No 3969712 and 5923263). These devices
are mainly useful in detecting people entering swimming
pools, presumably unauthorized. US Patent No 4747085 and
4932009 describe the use of an array of transducers
(typically ultrasonic) to detect the rate of motion of the
swimmer and if the rate is suspicious, drowning is assumed.
US Patent No 5043705 describes the use of ultrasonic
radar to scan the bottom and top layer of the swimming pool
to detect motionless bodies, assumed to be possible drowning
victims, while US Patent No 5043705 detects motionless bodies
by using sonar scanning upward from the bottom of the pool.
In another approach, US Patent No 5097254, 5408222 and
5907281 rely on a device worn by the swimmer. If the swimmer
goes below a certain depth for an extended duration of time,
an alarm will be generated or a float activated. US Patent No
6111510 uses a microphone system to detect heartbeats and
breathing sounds and measures the interval between the
absence and presence of these sounds to determine the
possibility of drowning.

The only patents that use video cameras for processing are as
follows:

1. US Patent No 5886630 describes the use of video cameras to detect
motionless bodies at the bottom of a swimming pool. The
presence of such motionless bodies is used as an indication
of possible drowning.

2. US Patent No 6133838 describes a method which involves the
installation of multiple underwater cameras at the side of a
swimming pool. Drowning is assumed if a body is detected to
be moving slowly or motionless underwater beyond a
predetermined time.

None of them, however, describes any method of
analyzing swimmers' behaviors. Most approaches assume that
absence of motion or slow motion is an indication of
drowning. There is, however, a significant possibility that the
actions of standing, swimming on the spot, diving or playing
underwater could trigger false alarms. To reduce such false
alarms, the systems above have to be installed sufficiently
deep. In such a case, however, distress or the early drowning
of a child at shallower places cannot be detected; moreover,
early distress calls by a swimmer and the typical initial
struggles on the water surface before sinking are ignored. By
the time the body sinks down to such a depth, there is only
little time left to rescue the victim. Furthermore, the above
systems require the placement of underwater cameras, which
involves significant and expensive installation procedures
and also interrupts the pool's
operation. Therefore, a system of this type leaves much to be
desired.

Summary of the Invention 

It is an object of the present invention to provide a system
and method for monitoring a swimming pool to indicate a risk
of drowning of swimmers, which system and method only make
use of above-water video cameras with optional microphones to
cover a swimming pool and which can reliably indicate a risk
of drowning of swimmers.

The object is achieved by analyzing swimmers' conditions
to detect distress or possible drowning cases, making
use of cues from swimmers' behaviors, such as body
orientation, area of the body above water, moving direction
and motion symmetries, the image features of the surrounding
areas of the swimmers (for example, water ripple patterns),
sudden changes in the swimming pattern, irregular activity,
and calls for assistance.

The present invention differs from the existing methods at 
least in the following aspects: 

1. The system is installed above the water, making
installation and maintenance easy and inexpensive. The
system can be installed in existing swimming pools without
the troublesome and expensive procedures of complete water
drainage or costly renovations for cabling and
installation of underwater devices.

2. The system monitors distress and early drowning
signs such as irregular swimming patterns, signs of
struggling, sudden submerging of the body, etc., and calls for
assistance. It does not depend solely on the motionless
cue.

The invention provides a method for monitoring a swimming
pool to indicate a risk of drowning of swimmers, the method
comprising:

taking a plurality of subsequent images of a monitoring
region at least partly containing a water surface of water
contained in the swimming pool by means of a camera outside
the water at a plurality of predetermined subsequent moments
in time,

in each image, detecting the presence of swimmer image
portions each of which shows a swimmer present in the image,
processing each detected swimmer image portion, so as to
assign to the respective swimmer image portion a
characteristic two-dimensional geometrical figure, wherein
the geometrical figure is characterized by at least one
predetermined geometrical attribute,

assigning to each geometrical figure a figure position 
of the geometrical figure, wherein the figure position 
corresponds to a position of the detected swimmer image 
portion in the image, 
for at least one pair of two subsequent images, i.e. of
a presently processed present image and a previously
processed previous image, comparing values of at least one
out of the figure position and the at least one geometrical
attribute of the present image to that of the previous image,
so as to detect a change in the figure position/geometrical
attribute of the present image as compared to the previous
image,

based on the detected change in the figure position/
geometrical attribute of the subsequent images, assigning to
the corresponding swimmer either a drowning condition
indicating that there is a risk of drowning of the swimmer or
a safe condition indicating that there is no risk of drowning
of the swimmer, and

outputting an output signal if a drowning condition is
assigned to the swimmer.

The invention further provides a system for monitoring a 
swimming pool, the system comprising: 
at least one camera being installed for taking a
plurality of images of a monitoring region at least partly
containing a water surface of water contained in the swimming
pool at a plurality of subsequent moments in time, the camera
further being installed outside the water, and

a computer being coupled to the camera to receive an 
image taken by the camera and being installed to process the 
image, wherein the computer comprises: 

a means for detecting the presence of swimmer image
portions each of which shows a swimmer present in the image,
processing each detected swimmer image portion, so as to
assign to the respective swimmer image portion a
characteristic two-dimensional geometrical figure, wherein
the geometrical figure is characterized by at least one
predetermined geometrical attribute,

a means for assigning to each geometrical figure a 
figure position of the geometrical figure, wherein the figure 
position corresponds to a position of the detected swimmer 
image portion in the image, 

a means for comparing, for at least one pair of two
subsequent images, i.e. of a presently processed present
image and a previously processed previous image, values of at
least one out of the figure position and the at least one
geometrical attribute of the present image to that of the
previous image, so as to detect a change in the figure
position/geometrical attribute of the present image as
compared to the previous image,

a means for assigning to the corresponding swimmer,
based on the detected change in the figure position/
geometrical attribute of the subsequent images, either a
drowning condition indicating that there is a risk of
drowning of the swimmer or a safe condition indicating that
there is no risk of drowning of the swimmer, and

a means for outputting an output signal if a
drowning condition is assigned to the swimmer.

Brief discussion of the drawings 

Fig. 1 shows a system setup for an embodiment of the system
according to the present invention;

Fig. 2 illustrates three overlapping images of three
different sub-regions of a monitoring region, the images
taken by three different cameras;

Fig. 3 shows (a) a typical image of a swimming pool and
histograms of the (b) Red, (c) Green and (d) Blue color
component of the image;

Fig. 4a shows a background scene of a swimming pool; 

Fig. 4b shows a segmentation of the image of Fig. 4a with
four clusters;

Fig. 4c shows a 3D (4D) scatter plot of the background scene
of Fig. 4a and 4b in RGB color space, where each cluster is
shown in a different color (gray scale);

Fig. 5a shows one time series of intensity values of a pixel
where ripples occur, for a plurality of subsequent images
(frames);

Fig. 5b shows one time series of intensity values of a pixel
that the swimmer goes through, for a plurality of subsequent
images (frames);

Fig. 6 (a) to (d) show images of swimmers in a swimming pool,
with detected swimmer image portions, wherein the contour of
each detected swimmer image portion is approximated by a
best-fit ellipse;

Fig. 7 shows a state flow diagram for the detection of
potential drowning;

Fig. 8 shows the orientation of the body of a tracked swimmer
for a plurality of subsequent frames taken by a video camera;

Fig. 9 shows the rate of orientation change obtained from
Fig. 8; and

Fig. 10 shows an illustration of the overall process flow of
an embodiment of the system according to the present
invention.

Detailed description of the preferred embodiments 


Hardware Setup 

A number of video cameras are mounted above the swimming
pool and are located around the pool such that these cameras
cover the view of the entire swimming pool. Typically, each
camera is mounted high up and at an angle viewing downward
to the pool area so as to cover a large field of view,
reduce occlusions of swimmers and minimize the perspective
foreshortening effects. All the cameras are enclosed in a
rain-proof compartment suitable for outdoor setting. Figure
1 shows an example of one such camera 100. There is an
overlap in the view of each camera 100 and the two cameras
200, 201 at its side. An example is shown in Figure 2 in
which 3 video cameras 100, 200, 201 are used to cover the
entire view of the pool 101. The views of camera 100 and
camera 200 overlap, as do those of camera 200 and camera 201.

These video cameras 100, 200, 201 capture the video
sequences of the activities inside the pool 101. All the
cameras 100, 200, 201 are identical and each is responsible
for monitoring a portion of the pool 101. The video
sequences obtained will be processed by computers 102 to
analyze for potential drowning cases.

In the following section, the method of analyzing one
swimmer's behavior is discussed within the context of a
single camera 100, the typical arrangement of which is
depicted in Figure 1. Similarly, the array of microphones
103 is located such that sound from any part of the swimming
pool 101 can be picked up reliably. The audio signal 104
will be enhanced to increase the signal-to-noise ratio. This
is also depicted in Figure 1.

The video signal 105 and audio signal 104 are processed by a
computer 102 or a cluster of computers. The video signal 105
can be sent to a dedicated monitor for viewing by the person
in charge either via a wired line or a wireless link.
Similarly, the audio signal 104 can be sent to the speaker
via either wired or wireless means. When an abnormal
condition is detected, which could possibly be a distress or
drowning incident, the person in charge will be alerted. The
person in charge can view the video signal 105 and hear the
audio signal 104 to decide whether it is a genuine drowning
incident and, if it is, a rescue operation would ensue.
If there is no response from the person in charge after a
short duration of time, a loud audible sound can be emitted
to alert people near the swimming pool 101.
Operating Principle 

The video data acquired from the video camera 100 is
sampled and digitized and the digitized data is made
available to the computer 102. Typically, the number of
samples or frames acquired is between 1 and 8 per second. The
video data being processed is broken into short sequences.
Typically, the duration of each sequence is between 20
seconds and 60 seconds. The operation for each sequence is
similar and is as follows.

For each sequence, a swimmer detection module is launched to
detect and count the number of swimmers. This process is
divided into three stages, namely global statistical model
generation, segmentation of swimmers and updating the
statistical model. These will be described in more detail
below.

1. Global statistical model generation 

Figure 3 shows the histogram 300 of a background scene of a
typical swimming pool 101 in RGB (red, green, and blue)
color space. These data behave as expected:
the black strip and water pixels form two fairly well
defined peaks. This observation inspired us to employ the
kernel-based mean shift procedure [1] to perform mean shift
clustering. It provides a mixture of Gaussian distributions
for the background scene.

The uniqueness of the background scene of the swimming pool 101 in a
data analysis context lies in the fact that the image data
are strongly correlated. Clusters in the joint domain
correspond to underlying contiguous regions within the
image, the recovery of which has been achieved using the
mean shift procedure. Figure 4 shows the 3D scatter plot 402
of the scene 400 of the swimming pool 101 in the RGB color
space and its segmentation 401. The image data are assigned
to clusters using Euclidean distance. Each cluster is
assumed to be a multivariate Gaussian characterized by its
mean value and covariance matrix.
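The assignment of image data to clusters and the per-cluster Gaussian parameters can be illustrated with a minimal sketch. It assumes the cluster centroids are already available (the mean shift step itself is not shown), and it keeps only a per-channel variance rather than the full covariance matrix described above. The function names `nearest_cluster` and `cluster_statistics` are hypothetical.

```python
import math

def nearest_cluster(pixel, centroids):
    """Index of the centroid closest to `pixel` in Euclidean distance."""
    return min(range(len(centroids)), key=lambda i: math.dist(pixel, centroids[i]))

def cluster_statistics(pixels, centroids):
    """Mean and per-channel variance of the pixels assigned to each centroid.

    Simplification (assumption): a diagonal covariance is kept instead of the
    full covariance matrix. Empty clusters yield None.
    """
    groups = [[] for _ in centroids]
    for p in pixels:
        groups[nearest_cluster(p, centroids)].append(p)
    stats = []
    for members in groups:
        if not members:
            stats.append(None)
            continue
        n = len(members)
        channels = list(zip(*members))          # one tuple per color channel
        mean = tuple(sum(ch) / n for ch in channels)
        var = tuple(sum((v - m) ** 2 for v in ch) / n
                    for ch, m in zip(channels, mean))
        stats.append((mean, var))
    return stats
```

In practice the centroids would come from the clustering of an empty-pool image, and the returned statistics would seed the background model.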

2. Segmentation of swimmers 

Swimmers in the scene are characterized by a relatively
large deviation from the background statistics. Changes in
the model scene are computed at every frame by comparing
current frames with the model. With the set of N clusters
$(C_i)_{i=1 \ldots N}$ that model the background scene at time t, a
similarity measure is computed between the incoming frame
and the background using the normalized Mahalanobis
distance. This provides an estimate of the similarity with
the background at every pixel. The distance measure of a
pixel in the current frame is:

$$D_p(X_t) = \min_{i} D_p(X_t \mid C_i), \quad i = 1 \ldots N,$$

$$D_p(X_t \mid C_i) = \ln |\Sigma_{i,t}| + (X_t - \mu_{i,t})^T \, \Sigma_{i,t}^{-1} \, (X_t - \mu_{i,t}),$$

where $D_p(X_t)$ measures the lowest difference between a
cluster centroid and the projection of the pixel's observed
value on the sub-space spanned by the cluster; $\mu_{i,t}$ and $\Sigma_{i,t}$
are the mean value and covariance matrix of the i-th cluster
at time t.
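As a rough illustration of this distance, the sketch below evaluates the log-determinant term plus the quadratic form for each cluster and keeps the smallest value. It assumes a diagonal covariance (one variance per color channel), which is a simplification of the full covariance matrix in the text; `cluster_distance` and `background_distance` are hypothetical names.

```python
import math

def cluster_distance(x, mean, var):
    """ln|Sigma| + (x - mu)^T Sigma^-1 (x - mu) for a diagonal Sigma.

    `var` holds the diagonal entries of Sigma (assumption: channels are
    treated as uncorrelated), so |Sigma| is the product of the variances.
    """
    log_det = sum(math.log(v) for v in var)
    quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mean, var))
    return log_det + quad

def background_distance(x, clusters):
    """Smallest distance from pixel value x to any background cluster."""
    return min(cluster_distance(x, mean, var) for mean, var in clusters)
```

A pixel whose `background_distance` exceeds a chosen threshold would then be a candidate swimmer pixel.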



Thus, if a region of the image is sufficiently dissimilar
from the modeled background, the system will consider it to
be a swimmer. A binary map is then formed for the current
frame, in which regions corresponding to the swimmers are
highlighted as motion blobs. Basic morphological operations
(erosion-dilation) [2] are applied to remove small noise
regions and to fill holes in these regions of interest.
However, some segmentation errors will still persist due to
reflections (or ripples) on the surface of the water, which
are not explicitly included in the background model. By
including a-priori knowledge of the swimmers, our system is
able to further suppress this type of false segmentation.

In order to differentiate reflections on the surface of
water and swimmers, we take a time series of intensity
values for two types of pixels. One type of pixel shows
reflections only (with no swimmers, compare
first time series 500), while swimmers pass through the other
type of pixel (compare second time series 501). Figure 5
shows the typical time series 500, 501 of these two types of
pixels. Clearly, there is a distinctive contrast between
swimmers and reflections in terms of intensity changes.
Swimmers generally cause a significant decrease in intensity
of the background scene, while reflections and/or ripples
make the scene brighter. As the pool 101 mainly contains
water, the overall brightness of the scene $B_s$ can be obtained
from the mean value of the cluster having the largest sample
size. We then impose another constraint on the segmentation
of the swimmers: $B_s - I_t > T_B$, where $I_t$ is the brightness of
the pixel in the current frame and $T_B$ is a suitably
chosen threshold. The binary map in which foreground regions
contain the swimmers is represented by:

$$b_i(t) = \bigcup \{ [D_p(X_t) > T] \wedge [B_s - I_t > T_B] \}$$

where $\wedge$ denotes the logical 'AND' operator. Thus, the binary
map for the frame at time t is defined by the locations
where the difference from the background model is greater
than a given threshold T and the brightness level is lower
than the overall brightness of the pool by the threshold $T_B$.
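The two-condition test can be sketched per pixel as follows. The inputs are assumed to be precomputed 2-D lists of background distances and pixel brightness values; the function name `swimmer_mask` and its parameter names are hypothetical.

```python
def swimmer_mask(distance, brightness, pool_brightness, t_dist, t_bright):
    """Binary map: 1 where a pixel is far from the background model AND
    darker than the overall pool brightness by more than t_bright.

    distance, brightness: 2-D lists of equal shape (assumed precomputed).
    """
    return [
        [1 if d > t_dist and pool_brightness - i > t_bright else 0
         for d, i in zip(d_row, i_row)]
        for d_row, i_row in zip(distance, brightness)
    ]
```

The brightness condition is what rejects ripples and reflections: they may differ strongly from the background model, but they brighten the scene rather than darken it.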

3. Updating the background global statistical model 

Since our background model gives the global description of
the background scene, the update of the model mainly caters
for changes in the overall lighting condition. For each
frame considered, the existing Gaussian clusters for the
background model are updated with the color values of pixels
that are not classified as swimmers. The background pixels
are assigned to the respective nearest clusters according to
the normalized Mahalanobis distance.

WO 02/097758

PCT/SG02/00105

The parameters $\mu_{i,t}$ and $\Sigma_{i,t}$ of the i-th cluster which matches the
set of N new samples $\{X_k\}$ are updated as follows:

$$\mu_{i,t} = (1 - \rho)\,\mu_{i,t-1} + \rho\,X_{i,t}$$
$$\Sigma_{i,t} = (1 - \rho)\,\Sigma_{i,t-1} + \rho\,C_{i,t}$$

where $X_{i,t}$ and $C_{i,t}$ are the mean and covariance computed from
the new samples matched to the i-th cluster,
and $\rho$ is the learning factor for adapting the current
distribution of the i-th cluster. In our implementation, the
learning factor is a constant to provide faster Gaussian
tracking at some expense of accuracy.
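A minimal sketch of such a running update is given below. It assumes that the new-sample terms are the sample mean and per-channel sample variance of the pixels matched to the cluster, and that the covariance is diagonal; both are assumptions of this sketch, as is the default value of the learning factor `rho`.

```python
def update_cluster(mean, var, samples, rho=0.05):
    """Blend a cluster's mean and per-channel variance with new samples.

    mean, var: current cluster parameters (one value per color channel).
    samples:   background pixels matched to this cluster in the new frame.
    rho:       constant learning factor (assumed value).
    """
    if not samples:
        return list(mean), list(var)            # nothing to learn from
    n = len(samples)
    channels = list(zip(*samples))
    s_mean = [sum(ch) / n for ch in channels]
    s_var = [sum((v - m) ** 2 for v in ch) / n for ch, m in zip(channels, s_mean)]
    new_mean = [(1 - rho) * m + rho * s for m, s in zip(mean, s_mean)]
    new_var = [(1 - rho) * v + rho * s for v, s in zip(var, s_var)]
    return new_mean, new_var
```

A larger `rho` follows lighting changes faster but makes the model noisier, which is the trade-off the text refers to.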

At the end of this segmentation module, the system
represents each of the detected swimmers with a set of
attributes, including an identity label, size, colour,
centroid position and major orientation of the segmented
swimmer represented by the major and minor axes of a
best-fit ellipse 600 to 603 around the swimmer. Figure 6 shows
detected swimmers in a swimming pool 101 with the
superimposed best-fit ellipses 600 to 603.
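One common way to obtain such an ellipse is from the second-order central moments of the blob's pixel coordinates. The sketch below uses that moment-based fit as an assumption; the text does not state which fitting method is used, and `blob_ellipse` is a hypothetical name.

```python
import math

def blob_ellipse(pixels):
    """Centroid, orientation and semi-axes of the moment-equivalent ellipse.

    pixels: list of (x, y) image coordinates belonging to one blob.
    Returns ((cx, cy), theta, major, minor), where theta is the angle of the
    major axis from the image x-axis in radians.
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    spread = math.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    # Semi-axes are 2*sqrt(eigenvalue) of the coordinate covariance matrix.
    major = math.sqrt(2 * (mu20 + mu02 + spread))
    minor = math.sqrt(2 * max(mu20 + mu02 - spread, 0.0))
    return (cx, cy), theta, major, minor
```

The ellipse area, proportional to `major * minor`, and `theta` are the quantities the later behaviour analysis relies on.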

This entire detection and tracking process is summarized as
follows:

1. Convert the original RGB color space into the HSV
color space.

2. Determine the area of interest to monitor 106 (water
area) during system setup.

3. Compute a global statistical model to represent the
background of the swimming pool 101. During system setup,
an estimated background scene 400 to 402 for the swimming
pool 101 is computed by observing the empty pool without
any swimmer for some time, and pre-stored.

4. Detect swimmers by locating regions with large
deviation from the background global statistical model.

5. Apply a series of morphological erosion and dilation
operations [2] to remove isolated noise pixels obtained
above.

6. Connect the changed pixels into contiguous
foreground regions, called blobs, using the connected
component analysis algorithm [2].

7. Update the background global statistical model.

8. Enclose each newly detected blob with a best-fit
ellipse 600 to 603.

9. Label and count the blobs inside the monitored area. 
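Steps 6 and 9 can be sketched with a simple breadth-first connected-component labelling of the binary map. The choice of 4-connectivity and the function name `label_blobs` are assumptions of this sketch.

```python
from collections import deque

def label_blobs(mask):
    """4-connected component labelling of a binary map.

    mask: 2-D list of 0/1 values. Returns (labels, count), where labels has
    the same shape as mask with 0 for background and 1..count for blobs.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                count += 1                      # start a new blob
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

The returned count is the estimated number of swimmers in the monitored area for that frame, before any merging of fragmented blobs.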

After the swimmers are detected, the swimmers will be
tracked using a multi-swimmer tracking module. This module
uses a Kalman filter based multiple hypotheses tracking
algorithm that incorporates color, position and size as the
matching features. The system initiates a Kalman model for
each detected swimmer. At each frame, the available pool of
Kalman models is used to identify the detected swimmers
with respect to the swimmers detected in the
previous frame (this process is called correspondence). When
unambiguous correspondence between a model and a swimmer can
be established, the model will be updated using the latest
information of that swimmer. Models that cannot be used to
explain any detected swimmer within a certain period will be
removed. In that case, the system assumes swimmers
corresponding to those models have left the monitored area.
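To make the predict/update cycle of a per-swimmer Kalman model concrete, here is a minimal sketch of a one-dimensional constant-velocity filter for a single coordinate. The motion model, the noise values `q` and `r`, the initial covariance and the class name are all assumptions; the tracker described above also uses color and size, which this sketch omits.

```python
class ConstantVelocityKalman:
    """1-D constant-velocity Kalman filter for one coordinate of a swimmer."""

    def __init__(self, position, q=0.01, r=0.25):
        self.p, self.v = float(position), 0.0   # state: position, velocity
        self.P = [[1.0, 0.0], [0.0, 10.0]]      # state covariance (assumed)
        self.q, self.r = q, r                   # process / measurement noise

    def predict(self, dt=1.0):
        """Advance the state by dt and return the predicted position."""
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                          # innovation variance
        k0, k1 = a / s, c / s                   # Kalman gain
        y = z - self.p                          # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
```

One such filter per coordinate and per swimmer would supply the predicted attributes used for correspondence in the next frame; a large innovation `y` indicates a poorly predicted swimmer.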

The multi-swimmer tracking process is as follows:

1. A region corresponding to the pool area will be
designated in each view of the video camera 100 so that
the computer 102 can easily know the area of interest
(AOI), i.e. the monitoring region 106, to monitor. This will be
done only once during system setup;

2. The tracking module will monitor for swimmers entering
the AOI (monitoring region 106);

3. From the blobs detected using the above method,
attributes of the swimmers such as the position and
velocity will be extracted to form a tracking attribute
vector, a. For example, if the tracking attributes used
are position, p, and velocity, v, then the vector a for
swimmer i will be in the form:

$a_i = (p_i, v_i)$

4. Compute the matching score matrix, M. Each entry $s_{ij}$
in M is the inverse of a weighted sum of the Euclidean
distance $D_{ij}$ between the i-th blob's tracking attribute
vector, $a_i$, in the current frame and the j-th swimmer's
tracking attribute vector, $a_j$, in the previous frame, and the
Euclidean distance $\hat{D}_{ij}$ between $a_i$ in the current frame and the
predicted tracking attribute vector, $\hat{a}_j$, of the j-th
swimmer. Therefore, the matrix M measures the likelihood
of blobs in the current frame corresponding to swimmers in
the previous frame. Assuming that there are m swimmers in the
current frame and n swimmers in the previous frame, then
the matrix M will have the form:



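$$M = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1n} \\ s_{21} & s_{22} & \cdots & s_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ s_{m1} & s_{m2} & \cdots & s_{mn} \end{pmatrix}$$

where $s_{ij}$ is the inverse of the weighted sum of $D_{ij}$ and $\hat{D}_{ij}$,

$\hat{D}_{ij} = \| a_i - \hat{a}_j \|$ and $D_{ij} = \| a_i - a_j \|$, with
$\| x \|$ denoting the Euclidean norm of x,
i = 1 ... m and j = 1 ... n.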

5. Establish the one-to-one correspondence of swimmers
between the current and previous frames by obtaining the
entry-by-entry (scalar) product of two matrices $M_1$ and $M_2$, both of which are
derived from the matching score matrix, M. $M_1$ has entries
equal to 1 corresponding to the highest blob-to-swimmer
matching scores and other entries equal to 0 for the
current frame while, similarly, $M_2$ has entries equal to 1
corresponding to the highest swimmer-to-blob matching
scores and other entries equal to 0 for the previous
frame. The product of $M_1$ and $M_2$ creates a matrix $M_3$,
whose nonzero entries indicate the unambiguous
correspondences between swimmers;

6. Analyze the locations of any unmatched blobs. If an
unmatched blob is sufficiently close to one of the existing
swimmers and appears away from the boundary of the AOI, it
will be merged into the nearest swimmer.
Otherwise, the unmatched blob will be added to the list of
newly appearing swimmers if the system manages to track
it over a few frames. If the new blob disappears after a
few frames, it will be considered as noise
and will be dropped;

7. Feed the blob's information into a Kalman filter [3]
to obtain the swimmer's predicted attributes for the next
frame.
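Steps 4 and 5 can be sketched as follows. The sketch assumes equal weights for the two distances and adds a small constant `eps` to avoid division by zero; both are choices made here rather than values given in the text, and the function names are hypothetical. The entry-by-entry product of the two indicator matrices is expressed directly as a mutual-best-match test.

```python
import math

def matching_scores(blobs, previous, predicted, w_prev=0.5, w_pred=0.5, eps=1e-9):
    """Score matrix: entry [i][j] is the inverse of the weighted sum of the
    distance from blob i to swimmer j's previous vector and the distance
    from blob i to swimmer j's predicted vector (weights are assumptions)."""
    return [[1.0 / (w_prev * math.dist(a, previous[j])
                    + w_pred * math.dist(a, predicted[j]) + eps)
             for j in range(len(previous))]
            for a in blobs]

def unambiguous_matches(scores):
    """Pairs (blob i, swimmer j) that are each other's best match.

    Equivalent to the nonzero entries of the entry-wise product of the
    row-maximum indicator matrix and the column-maximum indicator matrix.
    """
    if not scores or not scores[0]:
        return []
    m, n = len(scores), len(scores[0])
    best_swimmer = [max(range(n), key=lambda j: scores[i][j]) for i in range(m)]
    best_blob = [max(range(m), key=lambda i: scores[i][j]) for j in range(n)]
    return [(i, j) for i, j in enumerate(best_swimmer) if best_blob[j] == i]
```

Blobs that do not appear in the returned pairs are the unmatched blobs handled in step 6.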

Once the swimmers are successfully tracked, the
multiple-swimmer tracking module will extract the attributes of the
tracked swimmers. For each swimmer, multiple attributes are
extracted, such as the spatial location of the centroid of
the swimmer, body orientation and size. From the temporal
sequence of these attributes, other attributes such as the
rate of orientation change, moving directions, motion
symmetry, regularity of motion, sudden change in swimming
pattern and water ripple patterns can be obtained. By
learning the temporal model of these attributes, the system
will compute an overall score for each swimmer to determine
whether the swimmer is normal or at risk of drowning
(including cases of distress and early drowning).

The analysis process of the temporal model is based on the
optimal filtering of past measurements. The state flow
diagram 700 of the detection of potential drowning is given
in Figure 7. The system will consider the swimmer to be in an
abnormal condition if the system fails to give a good
prediction of the swimmer's attributes. This is illustrated
using the rate of orientation change of the swimmer's body as
an example. A sample plot 800 is shown in Figure 8. As can be
seen, starting from around Frame 350, the body orientation of
the swimmer changes much faster and is irregular. The fast
and irregular change in body orientation serves as an
indication of a breach of predictability of the swimming
pattern. Figure 9 shows the rate of orientation change 900.
This plot 900 is obtained from Figure 8 using the following
equation:

r(t) = ... , where T is the preset length of the temporal
window.

Once the rate of orientation change 900 is larger than a
preset threshold, the system will not be able to accurately
predict the body orientation in the new frame using prior
measurements. Therefore, a breach of predictability of the
swimming pattern is detected in this case, and becomes one
possible indicator of an abnormal condition. The
occurrence of several abnormal conditions together would be
a good indication of a swimmer at risk of drowning.
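The thresholding of the rate of orientation change can be sketched as below. The exact expression for r(t) is not legible in this text, so the sketch assumes the sum of absolute frame-to-frame orientation changes over the last T frames divided by T; that formula, the neglect of angle wrap-around and the function names are all assumptions.

```python
def orientation_change_rate(angles, window):
    """Assumed r(t): sum of |angle[k] - angle[k-1]| over the last `window`
    frames, divided by `window`. Angle wrap-around is ignored."""
    rates = []
    for t in range(len(angles)):
        start = max(1, t - window + 1)
        diffs = [abs(angles[k] - angles[k - 1]) for k in range(start, t + 1)]
        rates.append(sum(diffs) / window if diffs else 0.0)
    return rates

def breach_frames(angles, window, threshold):
    """Frame indices at which the rate exceeds the preset threshold."""
    rates = orientation_change_rate(angles, window)
    return [t for t, r in enumerate(rates) if r > threshold]
```

For a sequence that is steady and then starts swinging widely, `breach_frames` returns the frames from the onset of the irregular motion onward.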

The extraction of relevant features and their interpretation
is crucial for the automatic detection of swimmers at risk
of drowning. These features are an important aspect of this
invention and, besides the rate of change of orientation 900
described above, other features will now be described in
more detail:

a) forward motion of the swimmer; 

This attribute considers a swimmer to be in an abnormal
condition if the swimmer is not moving forward but there
is detected movement of the arms. This attribute is
characterized by the spatial location of the centroid of
the swimmer not changing beyond a preset boundary, given
by:

$$\| v_c - v_p \| \le D_{min}$$

where

$v_c$ = spatial location of centroid at current frame
$v_p$ = spatial location of centroid at previous frame
$v_i = \{x_i, y_i\}$, with the x, y coordinates in the image
$D_{min}$ = threshold of the change to consider as not moving
forward.

The rate of motion will also be considered. An abnormal
condition arises if the motion slows to almost a halt.

b) posture of the swimmer, whether upright or just slightly
leaning;

This attribute considers a swimmer to be in a
potentially abnormal condition if the posture of the
body is upright. This is characterized by the major axis
of the ellipse 600 to 603 being vertical or close to
vertical.

c) size of the swimmer's body above the water; 

This attribute considers a swimmer to be in a
potentially abnormal condition if the size of the body
that is not submerged in the water, inclusive of the head,
is reducing or if no increase in size is
detected for a period of time. This feature can be
characterized by the change in the area of the best-fit
ellipse 600 to 603.

d) path of the swimmer's movement; 

This attribute considers a swimmer to be in a
potentially abnormal condition if there is a significant
change in the path taken by the swimmer as predicted
from the past frames. If the body is in an upright
position, the path to be checked could include up-down
movement. The path can be obtained from the plot of the
centroid over time. The best-fit curve of the plot gives
the path taken by the swimmer, and the deviation of the
path is seen as a change in the best-fit curve or the
presence of a deflection point.
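
The path check of attribute d) might be sketched as follows. As a simplifying assumption, the best-fit curve is taken to be a straight line fitted by least squares to the past centroids; the description does not prescribe the curve type. The names fit_line, path_deviates and max_dev are hypothetical.

```python
import math


def fit_line(ts, vs):
    """Least-squares line v = slope * t + intercept through the samples."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:
        return 0.0, mean_v
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
    return slope, mean_v - slope * mean_t


def path_deviates(past_centroids, new_centroid, max_dev):
    """Return True if new_centroid is farther than max_dev from the
    position predicted by the line fitted to past_centroids."""
    ts = list(range(len(past_centroids)))
    t_next = len(past_centroids)
    sx, ix = fit_line(ts, [c[0] for c in past_centroids])
    sy, iy = fit_line(ts, [c[1] for c in past_centroids])
    pred_x = sx * t_next + ix
    pred_y = sy * t_next + iy
    return math.hypot(new_centroid[0] - pred_x, new_centroid[1] - pred_y) > max_dev
```

A swimmer moving steadily along a lane yields a small prediction error, whereas a sudden sideways or up-down excursion of the centroid exceeds max_dev and is reported as a deviation.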


e) motion symmetry of the swimmer; 

This attribute considers a swimmer to be in a 
potentially abnormal condition if the motion of the 
swimmer does not show any symmetry. One way to obtain
this attribute is to divide the image into two along the
major axis of the ellipse 600 to 603, flip one of the
halves about the axis and compute the correlation [2]
between the two. If the value of the correlation is
small, there is little symmetry in the motion of the
swimmer.
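
The symmetry measure of attribute e) can be illustrated with the sketch below. It assumes the swimmer image has already been rotated so that the major axis of the ellipse coincides with the vertical centre line of a rectangular grey-level patch, and it uses the Pearson correlation coefficient as the correlation; both are assumptions made for this example. The names pearson and symmetry_score are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def symmetry_score(patch):
    """Correlate the left half of the patch with the mirrored right half.

    patch is a list of rows of grey levels; for an odd width the centre
    column (the axis itself) is ignored.
    """
    half = len(patch[0]) // 2
    left, mirrored_right = [], []
    for row in patch:
        left.extend(row[:half])
        mirrored_right.extend(reversed(row[-half:]))
    return pearson(left, mirrored_right)
```

A score close to 1 indicates a mirror-symmetric appearance about the major axis; a score below a chosen threshold would be taken as a lack of symmetry.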

f) periodicity or repeatability of the swimmer's
movement pattern;

This attribute considers a swimmer to be in a
potentially abnormal condition if the motion of the
swimmer does not show any periodic or repeatable
pattern. One way to obtain this attribute is to
normalize the image extracted from the best-fit
ellipse 600 to 603 and then compute the
cross-correlation [2] of these images over different
frames. If the value of the correlation is smaller than a
predetermined threshold, then no repeatability of the
motion is detected.
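
A minimal sketch of the repeatability test of attribute f) follows. It assumes each frame has already been reduced to a normalized, equal-length list of grey levels taken from the best-fit ellipse, and it uses the Pearson correlation coefficient as a stand-in for the cross-correlation mentioned above. The names has_repeating_pattern and min_lag are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def has_repeating_pattern(frames, threshold, min_lag=1):
    """Return True if the latest frame correlates with some earlier frame.

    frames is a list of flattened, normalized swimmer images, oldest first.
    Only frames at least min_lag steps older than the latest are compared.
    """
    latest = frames[-1]
    candidates = frames[:len(frames) - min_lag]
    best = max((pearson(latest, f) for f in candidates), default=-1.0)
    return best >= threshold
```

Setting min_lag above 1 avoids counting the trivially similar preceding frame as evidence of a periodic stroke.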

g) ripple pattern in the surrounding area of the
swimmer;

This attribute considers a swimmer to be in a
potentially abnormal condition if the ripple surrounding
the swimmer is more violent than normal. This attribute
is characterized by the overall brightness of the water
surrounding the swimmer. If the overall brightness
exceeds the average brightness of the water by more than
a certain threshold, then abnormal ripple is considered
present.
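
The ripple test of attribute g) can be sketched as a comparison of mean brightness values. The sketch assumes that the brightness values of the water pixels around the swimmer and of the pool water as a whole are supplied as plain lists; the function name abnormal_ripple and its parameters are illustrative only.

```python
def abnormal_ripple(surround_pixels, pool_pixels, threshold):
    """Return True if the water around the swimmer is brighter than the
    average pool water by more than threshold.

    Both arguments are non-empty lists of brightness values.
    """
    mean_surround = sum(surround_pixels) / len(surround_pixels)
    mean_pool = sum(pool_pixels) / len(pool_pixels)
    return (mean_surround - mean_pool) > threshold
```

Splashing tends to produce bright foam and specular highlights, which is why a raised mean brightness around the swimmer is used here as a proxy for violent ripple.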

In all the above indicators, the presence of several abnormal
conditions together will serve as an indication of a swimmer
at risk of drowning. For example, if the body orientation is
vertical, there is no forward motion and the motion is not
symmetrical, then the swimmer is considered at risk of
drowning. Once such a condition is detected, the person in
charge will be alerted and the video signal 105 and audio
signal 104 corresponding to the area of the swimmer will be
made available to the person in charge. Figure 10 shows the
overall process flow 1000 of the proposed system.
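
The example rule given above, in which three abnormal indicators must be present together, could be expressed as in the following sketch. The Indicators container and the function at_risk are hypothetical names introduced for illustration; other combinations of indicators are equally possible.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    """Per-swimmer flags produced by the individual feature tests."""
    upright: bool
    no_forward_motion: bool
    asymmetric_motion: bool


def at_risk(ind):
    """Example rule: all three abnormal conditions must hold together."""
    return ind.upright and ind.no_forward_motion and ind.asymmetric_motion
```

Requiring several indicators at once reduces false alarms from a swimmer who is, for instance, merely treading water in an upright posture with regular, symmetric strokes.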

Summarizing, this invention describes an audio-visual based
method and system for early drowning detection. In
this invention, a number of cameras 100, 200, 201 are
mounted on top of a swimming pool 101. These cameras 100,
200, 201 are used to monitor swimmers in the pool 101
together with the aid of an array of microphones 103.
Similarly, the microphone array 103 is mounted above the
water to cover the entire swimming pool 101. Based on the
motion and activity of the swimmer detected through the
video cameras 100, 200, 201, the swimmer's condition is
automatically analyzed. Such automated video analysis
includes building the visual background model of the pool
101, detecting the presence of swimmers in the monitored
areas 106, estimating the number of swimmers inside the
monitored area 106, tracking the swimmers and analyzing the
behavior of each tracked swimmer in terms of body
orientation, moving direction and motion patterns. In
addition, the microphone array 103 is deployed to pick up
audio signals 104 originating from distress calls. Once the
system detects the presence of a potential drowning, both
visual and audio alarms will be activated to draw the
attention of the person in charge for further confirmation
and, if necessary, to provide follow-up rescue actions.
List of reference signs

100 video camera
101 swimming pool
102 computer
103 microphone
104 audio signal
105 video signal
106 monitored area of interest
200 video camera
201 video camera
300 histogram
400 background scene
401 segmentation
402 3D scatter plot
500 first time series
501 second time series
600 best-fit ellipse
601 best-fit ellipse
602 best-fit ellipse
603 best-fit ellipse
700 state flow diagram
800 sample plot
900 plot showing the rate of orientation change
1000 overall process flow
Claims

1. A method for monitoring a swimming pool (101) to
indicate a risk of drowning of swimmers, the method
comprising:

taking a plurality of subsequent images of a monitoring
region (106) at least partly containing a water surface of
water contained in the swimming pool (101) by means of a
camera (100) outside the water at a plurality of
predetermined subsequent moments in time,

in each image, detecting the presence of swimmer image
portions each of which shows a swimmer present in the image,

processing each detected swimmer image portion, so as to
assign to the respective swimmer image portion a
characteristic two-dimensional geometrical figure, wherein
the geometrical figure characterizes at least one geometrical
attribute,

assigning to each geometrical figure a figure position
of the geometrical figure, wherein the figure position
corresponds to a position of the detected swimmer image
portion in the image,

for at least one pair of two subsequent images, i.e. of
a presently processed present image and a previously
processed previous image, comparing values of at least one
out of the figure position and the at least one geometrical
attribute of the present image to that of the previous image,
so as to detect a change in the figure position/geometrical
attribute of the present image as compared to the previous
image,

based on the detected change in the figure position/
geometrical attribute of the subsequent images, assigning to
the corresponding swimmer either a drowning condition
indicating that there is a risk of drowning of the swimmer or
a safe condition indicating that there is no risk of drowning
of the swimmer, and

outputting an output signal if a drowning condition is
assigned to the swimmer.

2. The method according to claim 1, wherein the at least
one geometrical attribute comprises at least one attribute
out of:

a two-dimensional area of the geometrical figure
corresponding to an area of the corresponding swimmer's body
visible to the camera (100),

a shape of the geometrical figure corresponding to an
approximate contour of the corresponding swimmer's body
visible to the camera (100),

an orientation of an axis of the geometrical figure,
corresponding to an orientation of an axis of the
corresponding swimmer's body visible to the camera (100),

a length of an axis of the geometrical figure,
corresponding to a length or width of the corresponding
swimmer's body visible to the camera (100), and

a symmetry condition of the geometrical figure.

3. The method according to claim 1, 

wherein the characteristic two-dimensional geometrical
figure is a best-fit ellipse (600 to 603) by which a contour
of the respective swimmer image portion is approximated, a
major axis, a minor axis and a centroid (center) of the
best-fit ellipse (600 to 603) are adjusted to best approximate
said contour of the swimmer.

4. The method according to claim 3, wherein the at least
one geometrical attribute comprises at least one attribute
out of:

an area of the best-fit ellipse (600 to 603),

an orientation of the major axis of the best-fit ellipse
(600 to 603),

an orientation of the minor axis of the best-fit ellipse
(600 to 603),

a length of the major axis of the best-fit ellipse (600
to 603),

a length of the minor axis of the best-fit ellipse (600
to 603),

a symmetry of the swimmer's body visible to the camera
(100) with respect to the major axis of the best-fit ellipse
(600 to 603), and

a symmetry of the swimmer's body visible to the camera
(100) with respect to the minor axis of the best-fit ellipse
(600 to 603).

5. The method according to claim 1, further comprising:

for each image, forming a binary map of the image,
wherein each detected swimmer image portion is represented by
a highlighted blob in the binary map and wherein the surface
image portion is represented by dark regions in the binary
map.

6. The method according to claim 1,

wherein the images are video frames taken over a
predetermined length of time by means of a video camera
(100).

7. The method according to claim 1, wherein

a moving velocity of the figure position corresponding
to a moving velocity of the corresponding swimmer is determined
from the detected change in the figure position and

a possible drowning condition is assigned to the swimmer
if the moving velocity of the figure position is lower than a
predetermined moving velocity threshold.

8. The method according to claim 2, wherein

a possible drowning condition is assigned to the swimmer
if the orientation of the axis of the swimmer's body is
vertical for a predetermined certain number out of a
predetermined total number of subsequent images or that the
rate of change of the orientation (900) is higher than a
predetermined threshold for the rate of orientation change
(900).

9. The method according to claim 2, wherein

a possible drowning condition is assigned to the swimmer
if the detected area of the geometrical figure is lower than
a predetermined area threshold for a predetermined certain
number out of a predetermined total number of subsequent
images.

10. The method according to claim 1, wherein

a movement path of the figure position corresponding to
a movement path of the corresponding swimmer is determined
from the detected change in the figure position of a
plurality of subsequent images and

a possible drowning condition is assigned to the swimmer
if the movement path of the figure position shows at least
one deviation from an expected normal feature obtained by
predicting the movement path.

11. The method according to claim 1, wherein

a movement periodicity of the geometrical figure
corresponding to a movement periodicity of the corresponding
swimmer is determined from the detected change in the figure
position and/or at least one geometrical attribute of a
plurality of subsequent images and/or repeatable pattern of
the swimmer's body visible to the camera (100) and

a drowning condition is assigned to the swimmer if the
movement periodicity of the figure position shows at least
one predetermined abnormal feature.

12. The method according to claim 1, further comprising:

assigning to each geometrical figure a surrounding area
of the geometrical figure corresponding to an area of water
surrounding the corresponding swimmer, and

assigning to the swimmer a drowning condition if the
surrounding area shows at least one predetermined abnormal
feature.

13. The method according to claim 1, 

further comprising labeling each assigned geometrical 
figure, so as to distinguish different underlying swimmers. 


14. The method according to claim 1, further comprising 
counting the number of assigned geometrical figures. 

15. The method according to claim 1,

wherein a present assigned geometrical figure of a
present image is assigned to a predetermined swimmer, and
wherein a subsequent assigned geometrical figure of a
subsequent image is assigned to the identical predetermined
swimmer if the figure position of the subsequent assigned
geometrical figure is sufficiently close to the corresponding
figure position of the present assigned geometrical figure.

16. The method according to claim 15,

wherein the predetermined swimmer is deemed to have left
the monitoring region if, for a predetermined number of
subsequent images, no subsequent assigned geometrical figure
having a figure position which is sufficiently close to the
corresponding figure position of the respective present
assigned geometrical figure can be found.

17. The method according to claim 1, further comprising
providing a gauge global statistical model of the swimming
pool (101), the gauge global statistical model comprising at
least a water surface model characteristic of the water, and
wherein detecting the presence of swimmer image portions
comprises:

by processing of the image, generating a current global
statistical model, the current global statistical model
comprising at least a water surface model characterizing a
surface image portion comprising the water surface and, if at
least one swimmer is present in the image, at least one
swimmer model characterizing at least one swimmer image
portion comprising at least one swimmer,

identifying the at least one swimmer model by a
comparison of the current global statistical model to the
gauge global statistical model, so as to detect the presence
of at least one swimmer image portion.

18. The method according to claim 17, wherein the gauge 
global statistical model is generated from a previously 
processed image. 

19. The method according to claim 17, wherein the gauge
global statistical model is generated from an image taken of
the swimming pool (101) not containing any swimmers.
20. The method according to claim 17, further comprising:

updating the gauge global statistical model by the
current global statistical model, ignoring the swimmer model
if any is present.

21. The method according to claim 17,

wherein any surface image portion has mainly or on
average a surface color and the at least one swimmer image
portion has mainly or on average a swimmer color, and
wherein the generating of the global statistical model
comprises:

for a predetermined plurality of moments out of the
subsequent moments in time, converting respectively at least
one image into a color-distribution in a four-dimensional
color space spanned by three primary colors and a
brightness, wherein any surface image portion is converted
into a surface cluster representing the surface color, and
wherein any swimmer image portion is converted into a swimmer
cluster representing the swimmer color,

the converting of the image being performed by
generating a histogram (300) of the image in each of the
three primary colors, wherein each histogram (300)
indicates the frequency distribution of the image of the
swimming pool (101) in the respective primary color, and

the brightness at each point in said color space
being equal to a distance computed from the three
frequencies of occurrence in the three primary colors at
the respective point.

22. The method according to claim 21,

wherein the distance measure is computed using the
Euclidean distance.
23. The method according to claim 21,

wherein the color space is the RGB color space.

24. The method according to claim 21, wherein the color
space is the HSV color space.

25. The method according to claim 21, wherein the at least 
one swimmer image portion is detected in that the swimmer 
color deviates from the surface color. 


26. The method according to claim 21, further comprising:

assigning to each geometrical figure a surrounding area
of the geometrical figure corresponding to an area of water
surrounding the corresponding swimmer, and

assigning to the swimmer a drowning condition if the
surrounding area shows a color which is sufficiently
different from the swimmer color and which is sufficiently
different from the surface color.

27. The method according to claim 21, further comprising:

after detecting the presence of swimmer image portions
in a present image and before detecting the presence of
swimmer image portions in a subsequent image, updating the
water surface model of the gauge global statistical model by
the water surface model of the current global statistical
model of the present image.

28. The method according to claim 21,

wherein the detecting of the swimmer model is performed
by computing a difference measure between the current global
statistical model and the gauge global statistical model,
wherein the swimmer model is identified if the difference
measure is larger than a predetermined difference threshold.



29. The method according to claim 28,

wherein the difference measure is computed using the
normalized Mahalanobis distance.

30. The method according to claim 21, 

wherein the images are video frames taken over a 
predetermined length of time by means of a video camera (100) 
and wherein a sequence of several subsequent video frames is 
used in generating the respective color distribution. 



31. A system for monitoring a swimming pool (101), the
system comprising:

at least one camera (100) being installed for taking a
plurality of images of a monitoring region (106) at least
partly containing a water surface of water contained in the
swimming pool (101) at a plurality of subsequent moments in
time, the camera (100) further being installed outside the
water, and

a computer (102) being coupled to the camera (100) to
receive an image taken by the camera (100) and being
installed to process the image, wherein the computer (102)
comprises:

a means for detecting the presence of swimmer image
portions each of which shows a swimmer present in the image,
processing each detected swimmer image portion, so as to
assign to the respective swimmer image portion a
characteristic two-dimensional geometrical figure, wherein
the geometrical figure characterizes at least one geometrical
attribute,

a means for assigning to each geometrical figure a
figure position of the geometrical figure, wherein the figure
position corresponds to a position of the detected swimmer
image portion in the image,

a means for comparing, for at least one pair of two
subsequent images, i.e. of a presently processed present
image and a previously processed previous image, values of at
least one out of the figure position and the at least one
geometrical attribute of the present image to that of the
previous image, so as to detect a change in the figure
position/geometrical attribute of the present image as
compared to the previous image,

a means for assigning to the corresponding swimmer,
based on the detected change in the figure position/
geometrical attribute of the subsequent images, either a
drowning condition indicating that there is a risk of
drowning of the swimmer or a safe condition indicating that
there is no risk of drowning of the swimmer, and

a means for outputting an output signal if a
drowning condition is assigned to the swimmer.

32. The system according to claim 31,

further comprising a display means being coupled to the
computer (102) for displaying a display signal which is
dependent on the output signal.

33. The system according to claim 32,

wherein the display means is an alarm means and wherein
the display signal is a signal which is suited for driving
the alarm means.

34. The system according to claim 32,

wherein the display means is a computer monitor and
wherein the display signal is a signal which is suited for
being displayed on the computer monitor.
35. The system according to claim 31,

wherein a plurality of cameras (100, 200, 201) is
provided as the at least one camera (100, 200, 201), wherein
each camera (100, 200, 201) is installed for taking a
plurality of images of a predetermined sub-region of the
monitoring region at subsequent moments in time, wherein
images of different sub-regions are taken by different
cameras (100, 200, 201).

36. The system according to claim 35, 

wherein adjacent sub-regions partly overlap. 

37. The system according to claim 31,

further comprising at least one microphone (103) for
detecting sounds such as distress calls from swimmers.

38. A system for monitoring a swimming pool (101), the
system comprising:

at least one camera (100) being installed for taking a
plurality of images of a monitoring region (106) at least
partly containing a water surface of water contained in the
swimming pool (101) at a plurality of subsequent moments in
time, the camera (100) further being installed outside the
water, and

a computer (102) being coupled to the camera (100) to
receive an image taken by the camera (100) and being
installed to process the image, wherein the computer (102)
comprises:

a means for detecting the presence of swimmer image
portions each of which shows a swimmer present in the image,
processing each detected swimmer image portion, so as to
assign to the respective swimmer image portion a
characteristic two-dimensional geometrical figure, wherein
the geometrical figure characterizes at least one geometrical
attribute,

a means for assigning to each geometrical figure a
figure position of the geometrical figure, wherein the figure
position corresponds to a position of the detected swimmer
image portion in the image,

a means for comparing, for at least one pair of two
subsequent images, i.e. of a presently processed present
image and a previously processed previous image, values of at
least one out of the figure position and the at least one
geometrical attribute of the present image to that of the
previous image, so as to detect a change in the figure
position/geometrical attribute of the present image as
compared to the previous image,

a means for assigning to the corresponding swimmer,
based on the detected change in the figure position/
geometrical attribute of the subsequent images, either a
drowning condition indicating that there is a risk of
drowning of the swimmer or a safe condition indicating that
there is no risk of drowning of the swimmer, and

a means for outputting an output signal if a
drowning condition is assigned to the swimmer,

wherein the means for detecting the presence of swimmer image
portions comprises:

a means for generating a global statistic model
from a taken image,

a storage means for storing global statistic
models, and

a comparison means for comparing a current global
statistic model to a gauge global statistic model which is
stored in the storage means.
39. The method according to claim 1, further comprising the
step of differentiating a swimmer from ripples on the surface
of the water based on contrast information.

40. The system according to claim 31 or 38, further
comprising means for differentiating a swimmer from ripples
on the surface of the water based on contrast information.